Hybrid DNA synthesis of mature insulin-like growth factors.
Methods and compositions are provided for efficient
production of human insulin-like growth factor. Synthetic IGF
I and IGF II genes are joined to leader and processing signals
which provide for expression and secretion of the gene
product in yeast. Enhanced yields of the product may then be
recovered from the nutrient medium.

Yeast strains S. cerevisiae AB103 (pYIGF-I-10/1) and
AB103 (pYIGF-II-10/1) were deposited at the American Type
Culture Collection on April 23, 1983 and granted Accession
Nos. 20673 and 20674, respectively.
describe the sequence encoding for the α-factor and
spacers between two of such sequences.

SUMMARY OF THE INVENTION 
Methods and compositions are provided for the
efficient production of mature human insulin-like
growth factor (IGF). In particular, expression of
"pre"-IGF I and "pre"-IGF II in a yeast host facilitates
secretion of the polypeptides into the nutrient medium.
DNA constructs are generated by joining DNA sequences
from diverse sources, including both natural and
synthetic sources. The resulting DNA constructs are
stably replicated in the yeast and provide efficient,
high level production of processed "pre"-polypeptides
which may be isolated in high yield from the nutrient
medium.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
DNA sequences capable of expressing human
insulin-like growth factors (IGF I and II) are provided.
These DNA sequences can be incorporated into vectors,
and the resulting plasmids used to transform susceptible
hosts. Transformation of a susceptible host with such
recombinant plasmids results in expression of the
insulin-like growth factor gene and production of the
polypeptide product.

In particular, novel DNA constructs are
provided for the production of the precursor polypeptides
("pre"-IGF I and "pre"-IGF II) in a yeast host capable
of processing said precursor polypeptides and secreting
the mature polypeptide product into the nutrient
medium. The DNA constructs include a replication
system capable of stable maintenance in a yeast host,
an efficient promoter, a structural gene including
leader and processing signals in reading frame with
said structural gene, and a transcriptional terminator
sequence downstream from the structural gene.
Optionally, other sequences can be provided for
transcriptional regulation, amplification of the gene,
exogenous regulation of transcription, and the like.
By "pre"-IGF I and "pre"-IGF II, it is meant that the
DNA sequence encoding for the mature polypeptide is
joined to and in reading frame with a leader sequence
including processing signals efficiently recognized by
the yeast host. Thus, "pre" denotes the inclusion of
secretion and processing signals associated with a
yeast host and not any processing signals associated
with the gene encoding the polypeptide of interest.

In preparing the DNA construct, it is necessary
to bring the individual sequences embodying the replication
system, promoter, structural gene including leader
and processing signals, and terminator together in a
predetermined order to assure that they are able to
properly function in the resulting plasmid. As described
hereinafter, adaptor molecules may be employed to
assure the proper orientation and order of the sequences.

The IGF I and IGF II genes which are employed
may be chromosomal DNA, cDNA, synthetic DNA, or combinations
thereof. The leader and processing signals will
normally be derived from naturally occurring DNA sequences
in yeast which provide for secretion of a polypeptide.
Such polypeptides which are naturally secreted
by yeast include α-factor, a-factor, acid phosphatase
and the like. The remaining sequences which comprise
the construct, including the replication system, promoter,
and terminator, are well known and described in the
literature.

Since the various DNA sequences which are
joined to form the DNA construct of the present invention
will be derived from diverse sources, it will be
convenient to join the sequences by means of connecting
or adaptor molecules. In particular, adaptors can be
advantageously employed to connect the 3'-end of the
coding strand of the leader and processing signal
sequence to the 5'-end of the IGF coding strand together
with their respective complementary DNA strands. The
leader and processing signal sequence may be internally
restricted near its 3'-terminus so that it lacks a
predetermined number of base pairs of the coding
region. An adaptor can then be constructed so that,
when joining the leader and processing sequence to the
IGF coding strand, the missing base pairs are provided
and the IGF coding strand is in the proper reading
frame relative to the leader sequence. The synthetic
IGF coding region and/or the adaptor at its 3'-end will
provide translational stop codons to assure that the
C-terminus of the polypeptide is the same as the
naturally occurring C-terminus.

The adaptors will have from about 5 to 40
bases, more usually from about 8 to 35 bases, in the
coding sequence and may have either cohesive or blunt
ends, with cohesive ends being preferred. Desirably,
the termini of the adaptor will have cohesive ends
associated with different restriction enzymes so that
the adaptor will selectively link two different DNA
sequences having the appropriate complementary cohesive
end.

The subject invention will be illustrated
with synthetic fragments coding for IGF I and IGF II
joined to the leader and processing signals of yeast
α-factor. The yeast α-factor may be restricted with
HindIII and SalI. HindIII cleaves in the processing
signal of the α-factor precursor, cleaving 3' to the
second base in the coding strand of the glu codon,
while the HindIII recognition sequence completes the
glu codon, encodes for ala and provides the first 5'
base of the amino-terminal trp codon of mature α-factor.
With reference to the direction of transcription of the
α-factor gene, the SalI site is located upstream of the
transcriptional terminator.
The synthetic genes coding for IGF will have
nucleotide sequences based on the known amino acid
sequences of the IGF I and IGF II polypeptides.
Preferably, the synthetic sequences will employ codons
which are preferentially utilized by the yeast host,
e.g., based on the frequency with which the codons are
found in the genes coding for the yeast glycolytic
enzymes. Conveniently, the synthetic sequence will
include cohesive ends rather than blunt ends for insertion
into a restriction site in a cloning vehicle.
Furthermore, restriction sites will be designed into
the synthetic sequences using silent mutations in order
to generate fragments which may be annealed into
sequences capable of producing IGF I/IGF II hybrid
peptide molecules.
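
The codon choice described above can be sketched in Python. This is only an illustration: the one-codon-per-residue table is an assumption assembled from codons that occur in the synthetic IGF I sequence of the examples, not a measured yeast usage table, and the names PREFERRED_CODON, back_translate and translate are hypothetical.

```python
# Hypothetical single-codon table; each entry is a codon that appears in the
# synthetic IGF I gene of the examples (an assumption, not a usage survey).
PREFERRED_CODON = {
    "A": "GCT", "C": "TGT", "D": "GAC", "E": "GAA", "F": "TTC",
    "G": "GGT", "I": "ATC", "K": "AAG", "L": "TTG", "M": "ATG",
    "N": "AAC", "P": "CCA", "Q": "CAA", "R": "AGA", "S": "TCT",
    "T": "ACC", "V": "GTT", "Y": "TAC",
}


def back_translate(peptide: str) -> str:
    """Return a coding-strand DNA sequence (5' to 3') for a one-letter peptide."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide)


def translate(dna: str) -> str:
    """Translate DNA using only the codons present in PREFERRED_CODON."""
    inverse = {codon: aa for aa, codon in PREFERRED_CODON.items()}
    return "".join(inverse[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First ten residues of mature IGF I: Gly-Pro-Glu-Thr-Leu-Cys-Gly-Ala-Glu-Leu
print(back_translate("GPETLCGAEL"))
```

A one-codon table cannot reproduce a design that alternates between synonymous codons, for instance to place restriction sites by silent mutation, so a practical design tool would need a fuller table and a site-placement step.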

In the examples, the synthetic fragments are
provided with cohesive ends for EcoRI and inserted into
the EcoRI site in pBR328. Usually, the synthetic sequence
will include additional restriction sites proximal
to each end of the polypeptide coding region. Such
interior restriction sites are selected to provide
precise excision of the coding region from the cloning
vehicle and for joining to adaptors so that the final
DNA construct, including the leader and processing
signals and coding region, is in proper reading frame,
and in proper juxtaposition to a transcription terminator.
Preferably, the restriction sites will have the
recognition sequence offset from the cleavage site,
where cleavage is directed proximal to the coding
region and the recognition site is lost. This allows
cleavage precisely at each end of the coding region
regardless of the nucleotide sequence. HgaI sites are
provided in the examples.
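
To make the offset cleavage concrete, here is a minimal Python sketch using the 5' end of the synthetic IGF I fragment from the examples. It assumes the commonly cited HgaI geometry (a cut 5 nucleotides past GACGC on the strand carrying the site and 10 nucleotides past it on the opposite strand); that offset is not stated in this text, and the function name is hypothetical.

```python
SITE = "GACGC"  # HgaI recognition sequence given in the examples


def hgai_top_cut(top: str) -> int:
    """Index in the top strand at which HgaI cuts, for a site read on that strand.

    Assumption: the cut falls 5 nt past the recognition sequence on this
    strand (and 10 nt past it on the opposite strand, leaving a 5-nt
    5' overhang on the released fragment).
    """
    start = top.index(SITE)
    return start + len(SITE) + 5


# 5' end of the synthetic IGF I fragment: EcoRI end, HgaI site, first codons
left_end = "AATTCGACGCTTATGGGTCCAGAAACC"
cut = hgai_top_cut(left_end)
print(left_end[:cut], "|", left_end[cut:])
```

Under that assumption the cut lands immediately before the first codon of the coding region, and the recognition sequence stays with the discarded flank, which is the behaviour the paragraph above describes.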

In preparing the synthetic gene, overlapping
single stranded DNA (ssDNA) fragments are prepared by
conventional techniques. Such ssDNA fragments will
usually be from about 10 to 40 bases in length.
Although considerably longer fragments may be employed,
the synthetic yield decreases and it becomes more
difficult to assure that the proper sequence has not
been inadvertently degraded or altered. After the
ssDNA fragments have been synthesized, they are joined
under annealing conditions with complementary base
pairing assuring the proper order. The ends of the
fragments are then ligated, and the resulting synthetic
DNA fragment cloned and amplified, usually in a bacterial
host such as E. coli. As previously indicated, the
synthetic structural gene may be provided with cohesive
ends complementary to a suitable restriction site in
the cloning vehicle of interest and internal recognition
sites which allow for precise excision of the coding
region. After cloning and amplification of the synthetic
sequences, usable quantities of the sequences may be
excised, usually at the internal restriction sites on
either end of the IGF coding region.

Conveniently, the promoter which is employed
may be the promoter associated with the leader and
processing sequence. In this manner, a 5'-portable
element, which contains both the promoter and the
leader sequence in proper spatial relationship for
efficient transcription, may be provided. By further
including a transcriptional terminator, a "cassette"
consisting of promoter/leader - restriction site(s) -
terminator is created, where the IGF coding region may
be inserted with the aid of adaptors. Usually, such
cassettes may be provided by isolating a DNA fragment
which includes an intact gene from a yeast host and the
upstream and downstream transcriptional regulatory sequences
of the gene, where the gene expresses a polypeptide
which is secreted by the host.

Alternatively, one may replace the naturally
occurring yeast promoter by other promoters which allow
for transcriptional regulation. This will require
sequencing and/or restriction mapping of the region
upstream from the leader sequence to provide for introduction
of a different promoter. In some instances,
it may be desirable to retain the naturally occurring
yeast promoter and provide a second promoter in tandem,
either upstream or downstream from the naturally
occurring yeast promoter.

A wide variety of promoters are available or
can be obtained from yeast genes. Promoters of particular
interest include those promoters involved with
enzymes in the glycolytic pathway, such as promoters
for alcohol dehydrogenase, glyceraldehyde-3-phosphate
dehydrogenase, pyruvate kinase, triose phosphate
isomerase, phosphoglucoisomerase, phosphofructokinase,
etc. By employing these promoters with regulatory
sequences, such as enhancers, operators, etc., and
using a host having an intact regulatory system, one
can regulate the expression of the processed "pre"-IGF.
Thus, various small organic molecules, e.g. glucose,
may be employed for the regulation of production of the
desired polypeptide.

One may also employ temperature-sensitive
regulatory mutants which allow for modulation of
transcription by varying the temperature. Thus, by
growing the cells at the non-permissive temperature,
one can grow the cells to high density, before changing
the temperature in order to provide for expression of
the "pre"-polypeptides for IGF I and IGF II.

Other capabilities may also be introduced
into the construct. For example, some genes provide
for amplification, where upon stress to the host, not
only is the gene which responds to the stress amplified,
but also flanking regions. By placing such a gene
upstream from the promoter, coding region and the other
regulatory signals providing transcriptional control of
the "pre"-polypeptide, and stressing the yeast host,
plasmids may be obtained which have a plurality of
repeating sequences, which sequences include the
"pre"-polypeptide gene with its regulatory sequences.
Illustrative genes include metallothioneins and dihydrofolate
reductase.

The construct may include in addition to the
leader sequence fragment, other DNA homologous to the
host genome. If it is desired that there be integration
of the IGF gene into the chromosome(s), integration can
be enhanced by providing for sequences flanking the IGF
gene construct which are homologous to host chromosomal
DNA.

The replication system which is employed will
be recognized by the yeast host. Therefore, it is
desirable that the replication system be native to the
yeast host. A number of yeast vectors are reported by
Botstein et al., Gene (1979) 8:17-24. Of particular
interest are the YEp plasmids, which contain the 2μm
plasmid replication system. These plasmids are stably
maintained at multiple copy number. Alternatively or
in addition, one may use a combination of ARS1 and
CEN4, to provide for stable maintenance.

After each manipulation, as appropriate, the
construct may be cloned so that the desired construct
is obtained pure and in sufficient amount for further
manipulation. Desirably, a shuttle vector (i.e.,
containing both a yeast and bacterial origin of replication)
may be employed so that cloning can be performed
in prokaryotes, particularly E. coli.

The plasmids may be introduced into the yeast
host by any convenient means, employing yeast host
cells or spheroplasts and using calcium precipitated
DNA for transformation or liposomes or other conventional
techniques. The modified hosts may be selected in
accordance with the genetic markers which are usually
provided in a vector used to construct the expression
plasmid. An auxotrophic host may be employed, where
the plasmid has a gene which complements the host and
provides prototrophy. Alternatively, resistance to an
appropriate biocide, e.g. antibiotic, heavy metal,
toxin, or the like, may be included as a marker in the
plasmid. Selection may then be achieved by employing a
nutrient medium which stresses the parent cells, so as
to select for the cells containing the plasmid. The
plasmid-containing cells may then be grown in an
appropriate nutrient medium, and the desired secreted
polypeptide isolated in accordance with conventional
techniques. The polypeptide may be purified by chromatography,
filtration, extraction, etc. Since the
polypeptide will be present in mature form in the
nutrient medium, one can cycle the nutrient medium,
continuously removing the desired polypeptide.

The following examples are offered by way of
illustration and not by way of limitation.

EXPERIMENTAL 
Nucleotide sequences for human insulin-like
growth factors I and II (IGF I and II) comprising
preferred yeast codons were devised based on the amino
acid sequences reported in Rinderknecht and Humbel
(1978) J. Biol. Chem. 253:2769-2776 and Rinderknecht and
Humbel (1978) FEBS Letters 89:283-286, respectively.
The sequences (with the coding strands shown 5' to 3')
are as follows:



IGF I

                  |--> Coding Region
   AsnSerThrLeuMetGlyProGluThrLeuCysGlyAlaGluLeuValAspAlaLeuGln
5'-AATTCGACGCTTATGGGTCCAGAAACCTTGTGTGGTGCTGAATTGGTCGATGCTTTGCAA
       GCTGCGAATACCCAGGTCTTTGGAACACACCACGACTTAACCAGCTACGAAACGTT
   EcoRI  HgaI

   PheValCysGlyAspArgGlyPheTyrPheAsnLysProThrGlyTyrGlySerSerSer
   TTCGTTTGTGGTGACAGAGGTTTCTACTTCAACAAGCCAACCGGTTACGGTTCTTCTTCT
   AAGCAAACACCACTGTCTCCAAAGATGAAGTTGTTCGGTTGGCCAATGCCAAGAAGAAGA

   ArgArgAlaProGlnThrGlyIleValAspGluCysCysPheArgSerCysAspLeuArg
   AGAAGAGCTCCACAAACCGGTATCGTTGACGAATGTTGTTTCAGATCTTGTGACTTGAGA
   TCTTCTCGAGGTGTTTGGCCATAGCAACTGCTTACAACAAAGTCTAGAACACTGAACTCT

                                                   HgaI   EcoRI
   ArgLeuGluMetTyrCysAlaProLeuLysProAlaLysSerAlaOP MetArgArg
   AGATTGGAAATGTACTGTGCTCCATTGAAGCCAGCTAAGTCTGCTTGAATGCGTCG-3'
   TCTAACCTTTACATGACACGAGGTAACTTCGGTCGATTCAGACGAACTTACGCAGCTTAA
                              Coding Region <--|

IGF II

                  |--> Coding Region
   AsnSerThrLeuMetAlaTyrArgProSerGluThrLeuCysGlyGlyGluLeuValAsp
5'-AATTCGACGCTTATGGCTTACAGACCATCCGAAACCTTGTGTGGTGGTGAATTGGTCGAC
       GCTGCGAATACCGAATGTCTGGTAGGCTTTGGAACACACCACCACTTAACCAGCTG
   EcoRI  HgaI

   ThrLeuGlnPheValCysGlyAspArgGlyPheTyrPheSerArgProAlaSerArgVal
   ACCTTGCAATTCGTTTGTGGTGACAGAGGTTTCTACTTCTCCAGACCAGCTTCCAGAGTT
   TGGAACGTTAAGCAAACACCACTGTCTCCAAAGATGAAGAGGTCTGGTCGAAGGTCTCAA

   SerArgArgSerArgGlyIleValGluGluCysCysPheArgSerCysAspLeuAlaLeu
   TCTAGAAGATCCAGAGGTATCGTTGAAGAATGTTGTTTCAGATCTTGTGACTTGGCTTTG
   AGATCTTCTAGGTCTCCATAGCAACTTCTTACAACAAAGTCTAGAACACTGAACCGAAAC

                                          HgaI   EcoRI
   LeuGluThrTyrCysAlaThrProAlaLysSerGluOP MetArgArg
   TTGGAAACCTACTGTGCTACCCCAGCTAAGTCTGAATGAATGCGTCG-3'
   AACCTTTGGATGACACGATGGGGTCGATTCAGACTTACTTACGCAGCTTAA
                     Coding Region <--|


The sequences are provided with EcoRI cohesive ends at
both ends. Coding for IGF I begins at base 16 of the
coding strand and ends at base 225. HgaI restriction
sites are located at each end of the IGF I coding
region. The HgaI recognition sites (5'-GACGC-3') lie
outside of the IGF I coding region, i.e. between the
end of the synthetic sequence and the HgaI cleavage
site. The IGF II synthetic sequence is similarly
constructed with the coding strand beginning at base 16
and terminating at base 219.
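
The IGF I base numbering can be checked with a short Python sketch. The strand below is the IGF I coding strand as read from the listing (EcoRI end, HgaI site, coding region, stop codon, second HgaI site); treat the exact string as an assumption, since it is transcribed here for illustration only. The helper names are hypothetical, and the codon dictionary is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# IGF I coding strand, 5' to 3', as read from the listing (assumed transcription).
IGF1_TOP = (
    "AATTCGACGCTTATG"
    "GGTCCAGAAACCTTGTGTGGTGCTGAATTGGTCGATGCTTTGCAA"
    "TTCGTTTGTGGTGACAGAGGTTTCTACTTCAACAAGCCAACCGGTTACGGTTCTTCTTCT"
    "AGAAGAGCTCCACAAACCGGTATCGTTGACGAATGTTGTTTCAGATCTTGTGACTTGAGA"
    "AGATTGGAAATGTACTGTGCTCCATTGAAGCCAGCTAAGTCTGCT"
    "TGAATGCGTCG"
)


def coding_region(top, first_base, last_base):
    """Slice a strand using the 1-based, inclusive numbering used in the text."""
    return top[first_base - 1:last_base]


def translate(dna):
    """Translate a coding strand codon by codon ('*' marks a stop codon)."""
    return "".join(CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


cds = coding_region(IGF1_TOP, 16, 225)
print(len(cds), translate(cds))
```

Bases 16 through 225 give 210 nucleotides, that is, the 70 codons of mature IGF I, and the next three bases form the opal (TGA) stop codon marked OP in the listing.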

A synthetic DNA fragment for IGF I having the
sequence just described for IGF I was prepared by
synthesizing 20 overlapping ssDNA segments using the
phosphoramidite method (see co-pending application,
Ser. No. 457,412, filed January 12, 1983). The ssDNA
sequences were as follows:



Designation     Sequence

A               AATTCGACGCTTATGG
B-I-1           GTCCAGAAACCTTGTGTGGT
C-I-2b          GCTGAATTGGTC
C-I-2a          GATGCTTTGCAATTCGT
D               TTGTGGTGACAGAGGTTTCTACTTC
I-3             AACAAGCCAACCGGTTACGGTTCTTCTTC
E-I-4           TAGAAGAGCTCCACAAACCGGTATCGTT
F-I-5           GACGAATGTTGTTTCAGATCTTGTGACTTG
G-I-6           AGAAGATTGGAAATGTACTGTGCT
I-7             CCATTGAAGCCAGCTAAGTCT
H-I-8           GCTTGAATGCGTCG
J-I-9           GTTTCTGGACCCATAAGCGTCG
K-I-10          AAAGCATCGACCAATTCAGCACCACACAAG
L               CTCTGTCACCACAAACGAATTGC
M-I-11          AACCGGTTGGCTTGTTGAAGTAGAAAC
I-12            TGGAGCTCTTCTAGAAGAAGAACCGT
I-13            GAAACAACATTCGTCAACGATACCGGTTTG
N-I-14          CCAATCTTCTCAAGTCACAAGATCT
I-15            GGCTTCAATGGAGCACAGTACATTT
I-16            AATTCGACGCATTCAAGCAGACTTAGCT



The ssDNA segments were joined as follows:
50 pmoles of each segment except A and I-16 were
5'-phosphorylated with T4 polynucleotide kinase.
50 pmoles of all segments were then annealed in a single
step as an 18 μl pool by cooling from 95° to 25° over
1.5 hrs. Ligation was achieved in a reaction volume of
30 μl containing 1 mM ATP, 10 mM DTT, 100 mM tris-HCl, pH
7.8, 10 mM MgCl2, 1 μg/ml spermidine and T4 ligase.
The appropriate double-stranded fragment resulting from
the order and pairing of fragments shown in Figure 1
was purified on a 7% native polyacrylamide electrophoresis
gel.
Figure 1

IGF I ANNEALING AND LIGATION SCHEME

A  B-I-1  C-I-2b  C-I-2a  D  I-3  E-I-4  F-I-5  G-I-6  I-7  H-I-8
-----------------------------------------------------------------
 J-I-9  K-I-10  L  M-I-11  I-12  I-13  N-I-14  I-15  I-16
A DNA sequence for IGF II was similarly
synthesized. The following additional ssDNA fragments
were prepared.

Designation     Sequence

B-II-1          CTTACAGACCATCCGAAACCTTGTGTGGT
C-II-2          GGTGAATTGGTCGACACCTTGCAATTCGT
II-3            TCCAGACCAGCTTCCAGAGTTTCT
E-II-4          AGAAGATCCAGAGGTATCGTT
F-II-5          GAAGAATGTTGTTTCAGATCTTGTGACTTG
G-II-6          GCTTTGTTGGAAACCTACTGTGCT
H-II-7          ACCCCAGCTAAGTCTGAATGAATGCGTCG
J-II-8          GTTTCGGATGGTCTGTAAGCCATAAGCGTCG
K-II-9          AAGGTGTCGACCAATTCACCACCACACAAG
M-II-10         AAGCTGGTCTGGAGAAGTAGAAAC
II-11           CTGGATCTTCTAGAAACTCTGG
II-12           GAAACAACATTCTTCAACGATACCT
N-II-13         TTCCAACAAAGCCAAGTCACAAGATCT
II-14           AGACTTAGCTGGGGTAGCACAGTAGGT
II-15           AATTCGACGCATTCATTC



200 pmoles of these ssDNA fragments and
fragments A, D and L were joined in a similar manner as
above, wherein A and II-15 were not phosphorylated,
resulting in the following ordering and pairing of
segments.



Figure 2

IGF II ANNEALING AND LIGATION SCHEME

A  B-II-1  C-II-2  D  II-3  E-II-4  F-II-5  G-II-6  H-II-7
-----------------------------------------------------------------
 J-II-8  K-II-9  L  M-II-10  II-11  II-12  N-II-13  II-14  II-15
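
The ordering and pairing of segments can be checked in a few lines of Python. The sketch below is illustrative only: the segment sequences are a reading of the listings above, with doubtful positions resolved by base pairing against the opposite strand, so they are an assumption and not an authoritative record. The helper revcomp and the variable names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Coding-strand segments in annealing order, each written 5' to 3'.
TOP = [
    "AATTCGACGCTTATGG",                 # A
    "CTTACAGACCATCCGAAACCTTGTGTGGT",    # B-II-1
    "GGTGAATTGGTCGACACCTTGCAATTCGT",    # C-II-2
    "TTGTGGTGACAGAGGTTTCTACTTC",        # D
    "TCCAGACCAGCTTCCAGAGTTTCT",         # II-3
    "AGAAGATCCAGAGGTATCGTT",            # E-II-4
    "GAAGAATGTTGTTTCAGATCTTGTGACTTG",   # F-II-5
    "GCTTTGTTGGAAACCTACTGTGCT",         # G-II-6
    "ACCCCAGCTAAGTCTGAATGAATGCGTCG",    # H-II-7
]
# Complementary-strand segments, left to right as drawn in the scheme,
# each written 5' to 3'.
BOTTOM = [
    "GTTTCGGATGGTCTGTAAGCCATAAGCGTCG",  # J-II-8
    "AAGGTGTCGACCAATTCACCACCACACAAG",   # K-II-9
    "CTCTGTCACCACAAACGAATTGC",          # L
    "AAGCTGGTCTGGAGAAGTAGAAAC",         # M-II-10
    "CTGGATCTTCTAGAAACTCTGG",           # II-11
    "GAAACAACATTCTTCAACGATACCT",        # II-12
    "TTCCAACAAAGCCAAGTCACAAGATCT",      # N-II-13
    "AGACTTAGCTGGGGTAGCACAGTAGGT",      # II-14
    "AATTCGACGCATTCATTC",               # II-15
]

top = "".join(TOP)
# Reverse-complementing each bottom segment maps it onto top-strand coordinates.
bottom_as_top = "".join(revcomp(s) for s in BOTTOM)

# Each strand carries a 4-nt 5' AATT extension (the EcoRI cohesive ends),
# so the paired region is the top strand minus its first four bases.
duplex_ok = top[4:] == bottom_as_top[:-4]
print(len(top), duplex_ok)
```

Because the segment boundaries on the two strands are staggered, every junction on one strand is bridged by an intact segment on the other, which is what allows the pool to be annealed and ligated in a single step.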



The synthetic DNA sequences were inserted
into the EcoRI site of pBR328 to produce plasmids
p328IGF I and p328IGF II. After cloning, the IGF
coding strands were excised using HgaI.

Synthetic oligonucleotide adaptors were then
ligated to the HgaI restriction fragments. For IGF I,
the adaptors had the following sequences:

a) 5'-AGCTGAAGCT-3'
       3'-CTTCGACCAGG-5'

b) 5'-CTGCTTGATAAG-3'
        3'-ACTATTCAGCT-5'



With orientation with respect to the coding strand, the
3'-end of the first adaptor, a), is complementary to
the 5' HgaI cleavage site on the IGF I synthetic
sequence, while the 5'-end of the first adaptor provides
a HindIII cohesive end. The second adaptor, b), is
complementary at its 5'-end to the HgaI cleavage site
at the 3'-end of the IGF I sequence, while the 3'-end
of the adaptor provides a SalI cohesive end.

For IGF II, the adaptors had the following
sequences:

c) 5'-AGCTGAAGCT-3'
       3'-CTTCGACGAAT-5'

d) 5'-CTGAATGATAAG-3'
        3'-ACTATTCAGCT-5'

With orientation as above the first adaptor, c), is
complementary at its 3'-end to the 5'-HgaI cleavage
site in the IGF II synthetic sequence, while the 5'-end
provides a HindIII cohesive end. The second adaptor,
d), is complementary at its 5'-end to the second HgaI
cleavage site in the IGF II synthetic sequence and
provides a SalI cohesive end at its 3'-end.
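
The pairing within each adaptor and the single-stranded ends it presents can be verified with a minimal Python sketch. Assumptions are stated plainly: the lower strand of adaptor a) is taken as CTTCGACCAGG (the reading that pairs with the upper strand), each adaptor is drawn with the upper strand protruding on the left and the lower strand on the right, and the function name check_adaptor is hypothetical.

```python
COMP = str.maketrans("ACGT", "TGCA")


def check_adaptor(top, bottom_3to5, left_overhang):
    """Check base pairing in an adaptor and report its single-stranded ends.

    top           upper strand, written 5' to 3'
    bottom_3to5   lower strand, written 3' to 5' (as drawn under the top strand)
    left_overhang number of unpaired bases at the 5' end of the upper strand
    Returns (paired_ok, left overhang 5'->3', right overhang 5'->3').
    """
    paired_top = top[left_overhang:]
    paired_bottom = bottom_3to5[:len(paired_top)]
    paired_ok = paired_top.translate(COMP) == paired_bottom
    right = bottom_3to5[len(paired_top):][::-1]  # flip to read 5' to 3'
    return paired_ok, top[:left_overhang], right


ADAPTORS = {
    "a": ("AGCTGAAGCT", "CTTCGACCAGG", 4),
    "b": ("CTGCTTGATAAG", "ACTATTCAGCT", 5),
    "c": ("AGCTGAAGCT", "CTTCGACGAAT", 4),
    "d": ("CTGAATGATAAG", "ACTATTCAGCT", 5),
}
for name, (top, bottom, n) in ADAPTORS.items():
    print(name, check_adaptor(top, bottom, n))
```

On this reading, adaptors a) and c) present an AGCT end compatible with HindIII and a five-base end complementary to the start of the respective coding region, while b) and d) present a five-base end matching the end of the coding region and a TCGA end compatible with SalI.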

The synthetic fragments and adaptors joined
thereto were purified by preparative gel electrophoresis
and ligated to 100 ng of pAB113 which had been previously
digested to completion with endonucleases HindIII and
SalI. pAB113 was derived from pAB112 by deletion of
the three 63 bp HindIII fragments. pAB112 is a plasmid
containing a 1.8 kb EcoRI fragment with the yeast
α-factor gene cloned in the EcoRI site of pBR322 in
which the HindIII and SalI sites had been deleted.
pAB112 was derived from plasmid pAB101 which contains
the yeast α-factor gene as a partial Sau3A fragment
cloned in the BamHI site of plasmid YEp24. pAB101 was
obtained by screening a yeast genomic library cloned in
YEp24 using an enzymatically 32P radiolabeled synthetic
oligonucleotide probe homologous to the published
α-factor coding region (Kurjan and Herskowitz, Abstracts
1981 Cold Spring Harbor meeting on the Molecular
Biology of Yeasts, page 242).

The resulting mixtures were used to transform
E. coli HB101 cells, and plasmids pAB113-IGF-I and
pAB113-IGF-II were obtained for IGF I and IGF II,
respectively.

Plasmids pAB113-IGF-I and pAB113-IGF-II (5 μg
each) having the IGF I and IGF II structural genes,
respectively, were digested to completion with EcoRI
and the resulting fragments were ligated to an excess
of EcoRI-BamHI adaptors and digested with BamHI. The
resultant 1.8 kb BamHI fragments were isolated by
preparative gel electrophoresis and approximately 100 ng
of each fragment was ligated to 100 ng of pC1/1, which
had been previously digested to completion with BamHI
and treated with alkaline phosphatase.

Plasmid pC1/1 is a derivative of pJDB219,
Beggs, Nature (1978) 275:104, in which the region
corresponding to bacterial plasmid pMB9 in pJDB219 has
been replaced by pBR322 in pC1/1. Each ligation
mixture was used to transform E. coli HB101 cells.
Transformants were selected by ampicillin resistance
and their plasmids analyzed by restriction endonuclease
digestion. DNA from one selected clone for each
structural gene, IGF I or IGF II, (pYIGF-I-10/1) or
(pYIGF-II-10/1), respectively, was prepared and used to
transform yeast AB103 cells. Transformants were
selected by their Leu+ phenotype.

Two cultures (5 and 9 liters) of yeast strain
AB103 (α, pep 4-3, leu 2-3, leu 2-112, ura 3-52, his
4-580) transformed with plasmid (pYIGF-I-10/1) were
grown at 30°C in -leu medium to saturation (optical
density at 650 nm of 5) and left shaking at 30°C for an
additional 12 hr period. Cell supernatants were
collected from each culture by centrifugation and the
IGF I concentrated by absorption on an ion exchange
resin (Biorex-70 available from Bio-Rad Laboratories,
Inc., Richmond, California). After elution with 10 mM
HCl in 80% ethanol, the IGF I fractions (0.4 ml and 3 ml,
respectively) were assayed for total protein concentration
and IGF I concentration. The protein assay was
the Coomassie Blue assay available from Bio-Rad Laboratories,
Inc., Richmond, California. The IGF I assay
was a conventional competitive radioimmunoassay employing
radiolabeled IGF I. The results were as follows:

                         Volume After       Total
Trial No.   Volume       Concentration      Protein       IGF-I

    1       5 liters        0.4 ml          5.0 mg/ml     3.75 mg/ml
    2       9 liters        3.0 ml          2.9 mg/ml     3.00 mg/ml
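
As a worked example of how these figures relate to culture volume, the following Python sketch converts each trial to total IGF I recovered and to recovery per liter of culture. It assumes the radioimmunoassay concentration refers to the concentrated fraction whose volume is given in the table; the function name igf_yield is hypothetical.

```python
trials = [
    # (trial, culture volume in liters, concentrate volume in ml, IGF I in mg/ml)
    (1, 5.0, 0.4, 3.75),
    (2, 9.0, 3.0, 3.00),
]


def igf_yield(culture_l, concentrate_ml, igf_mg_per_ml):
    """Return (total mg of IGF I in the concentrate, mg per liter of culture)."""
    total_mg = concentrate_ml * igf_mg_per_ml
    return total_mg, total_mg / culture_l


for trial, culture_l, conc_ml, igf in trials:
    total_mg, per_liter = igf_yield(culture_l, conc_ml, igf)
    print(f"Trial {trial}: {total_mg:.2f} mg IGF I, "
          f"{per_liter:.2f} mg per liter of culture")
```

Under that assumption the two trials correspond to about 1.5 mg and 9 mg of IGF I, or roughly 0.3 and 1.0 mg per liter of culture.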

A bioassay of IGF I, based on the synergistic effect of
the peptide to promote the response of pigeon crop sac
to prolactin (Anderson et al. (1983) in Somatomedins/
Insulin-Like Growth Factors, Spencer, E.M., ed., Walter
de Gruyter, Berlin) reveals that the IGF I product of
these preparations has activity equivalent to authentic
IGF I isolated from human serum.

Cultures of AB103 (pYIGF-II-10/1) grown
similarly to the above and assayed using a human
placental membrane radioreceptor assay (Spencer et al.
(1979) Acta Endocrinol. 91:36-48) for IGF II revealed
3.9 units/ml where normal human serum possesses 1
unit/ml.

In accordance with the subject invention,
novel constructs are provided which may be inserted
into vectors to provide for expression of human insulin-like
growth factor I to provide processing and secretion.
Thus, one can obtain a polypeptide having the identical
sequence to the naturally occurring human insulin-like
growth factor I. By virtue of providing for secretion,
greatly enhanced yields can be obtained based on cell
population and subsequent preparative operations and
purification are simplified.
Although the foregoing invention has been
described in some detail by way of illustration and
example for purposes of clarity of understanding, it
will be obvious that certain changes and modifications
may be practiced within the scope of the appended
claims.



WHAT IS CLAIMED IS: 
1. A method for efficient production of a
human insulin-like growth factor (IGF) in a suitable
host, said method comprising growing host cells containing
a functional DNA construct having the gene encoding
said IGF joined in proper reading frame with secretory
leader and processing signal sequences recognized by
said host to form a structural gene downstream from and
under the transcriptional regulatory control of a
transcriptional initiation region in a vector compatible
with said host, and isolating secreted human IGF.

2. A method as in claim 1, wherein the IGF
gene is a synthetic gene having codons preferentially
utilized by the host.

3. A method as in claim 1, wherein the
human IGF gene is a synthetic IGF I gene having the
following nucleotide sequence:

   GlyProGluThrLeuCysGlyAlaGluLeuValAspAlaLeuGln
5'-GGTCCAGAAACCTTGTGTGGTGCTGAATTGGTCGATGCTTTGCAA
   CCAGGTCTTTGGAACACACCACGACTTAACCAGCTACGAAACGTT

   PheValCysGlyAspArgGlyPheTyrPheAsnLysProThrGlyTyrGlySerSerSer
   TTCGTTTGTGGTGACAGAGGTTTCTACTTCAACAAGCCAACCGGTTACGGTTCTTCTTCT
   AAGCAAACACCACTGTCTCCAAAGATGAAGTTGTTCGGTTGGCCAATGCCAAGAAGAAGA

   ArgArgAlaProGlnThrGlyIleValAspGluCysCysPheArgSerCysAspLeuArg
   AGAAGAGCTCCACAAACCGGTATCGTTGACGAATGTTGTTTCAGATCTTGTGACTTGAGA
   TCTTCTCGAGGTGTTTGGCCATAGCAACTGCTTACAACAAAGTCTAGAACACTGAACTCT

   ArgLeuGluMetTyrCysAlaProLeuLysProAlaLysSerAla
   AGATTGGAAATGTACTGTGCTCCATTGAAGCCAGCTAAGTCTGCT-3'
   TCTAACCTTTACATGACACGAGGTAACTTCGGTCGATTCAGACGA
4. A method as in claim 1, wherein the
human IGF gene is a synthetic IGF II gene having the
following nucleotide sequence:

   AlaTyrArgProSerGluThrLeuCysGlyGlyGluLeuValAsp
5'-GCTTACAGACCATCCGAAACCTTGTGTGGTGGTGAATTGGTCGAC
   CGAATGTCTGGTAGGCTTTGGAACACACCACCACTTAACCAGCTG

   ThrLeuGlnPheValCysGlyAspArgGlyPheTyrPheSerArgProAlaSerArgVal
   ACCTTGCAATTCGTTTGTGGTGACAGAGGTTTCTACTTCTCCAGACCAGCTTCCAGAGTT
   TGGAACGTTAAGCAAACACCACTGTCTCCAAAGATGAAGAGGTCTGGTCGAAGGTCTCAA

   SerArgArgSerArgGlyIleValGluGluCysCysPheArgSerCysAspLeuAlaLeu
   TCTAGAAGATCCAGAGGTATCGTTGAAGAATGTTGTTTCAGATCTTGTGACTTGGCTTTG
   AGATCTTCTAGGTCTCCATAGCAACTTCTTACAACAAAGTCTAGAACACTGAACCGAAAC

   LeuGluThrTyrCysAlaThrProAlaLysSerGlu
   TTGGAAACCTACTGTGCTACCCCAGCTAAGTCTGAA-3'
   AACCTTTGGATGACACGATGGGGTCGATTCAGACTT

5. A method as in claim 1, wherein the host
is a yeast.

6. A method as in claim 1, wherein the host
is a yeast and the vehicle is a derivative of the 2μm
plasmid.

7. A method for expressing human insulin-like
growth factor (IGF) in yeast resulting in secretion
of said IGF promoted by yeast secretion and processing
signals, said method comprising:

preparing a first DNA fragment containing a
first DNA sequence encoding for IGF and having the
5'-terminus of the coding strand at or downstream from
the first base of said DNA sequence;

preparing a second DNA fragment containing a
second DNA sequence encoding for yeast-recognizable
secretion and processing signals and having the 3'-terminus
of the coding strand at or upstream from the
terminal base of said processing signal, wherein at
least one of said first and second fragments has its
terminus internal to the coding sequence resulting in
missing base pairs;

joining said first and second DNA fragments
by means of an adaptor encoding for the missing base
pairs of said first and second DNA fragments, wherein
said first and second DNA fragments are in the same
reading frame;

providing a termination codon downstream of
and adjacent to the first fragment to provide a "pre"-IGF
gene; and

cloning said "pre"-IGF gene in an expression
vector in yeast, wherein said "pre"-IGF is secreted and
processed.

8. A method according to claim 7, wherein
said IGF is IGF I.

9. A method according to claim 7, wherein
said IGF is IGF II.

10. A method according to claim 7, wherein
said expression vector includes a replication system
recognized by bacteria.

11. A method according to claim 7, wherein
said yeast replication system comprises the 2μm plasmid
or portion thereof.

12. A DNA construct comprising a coding
sequence coding for human insulin-like growth factor
(IGF) under transcriptional regulation of regulatory
signals utilized by yeast.
13. A DNA construct according to claim 12,
wherein said IGF coding sequence codes for IGF I.

14. A DNA construct according to claim 12,
wherein said IGF coding sequence codes for IGF II.

15. A DNA construct according to claim 12
including a replication system recognized by yeast.

16. A DNA construct according to claim 12,
where at least a portion of said coding sequence has
codons preferentially utilized by yeast and different
from human codons.

17. A DNA construct according to claim 12,
wherein said construct includes a replication system
utilized by bacteria.

18. A DNA construct according to claim 15
wherein said yeast replication system comprises the 2μm
plasmid or portion thereof.



19. A DNA construct according to claim 12, 
including the sequence: 
   GlyProGluThrLeuCysGlyAlaGluLeuValAspAlaLeuGln
5'-GGTCCAGAAACCTTGTGTGGTGCTGAATTGGTCGATGCTTTGCAA
   CCAGGTCTTTGGAACACACCACGACTTAACCAGCTACGAAACGTT

   PheValCysGlyAspArgGlyPheTyrPheAsnLysProThrGlyTyrGlySerSerSer
   TTCGTTTGTGGTGACAGAGGTTTCTACTTCAACAAGCCAACCGGTTACGGTTCTTCTTCT
   AAGCAAACACCACTGTCTCCAAAGATGAAGTTGTTCGGTTGGCCAATGCCAAGAAGAAGA

   ArgArgAlaProGlnThrGlyIleValAspGluCysCysPheArgSerCysAspLeuArg
   AGAAGAGCTCCACAAACCGGTATCGTTGACGAATGTTGTTTCAGATCTTGTGACTTGAGA
   TCTTCTCGAGGTGTTTGGCCATAGCAACTGCTTACAACAAAGTCTAGAACACTGAACTCT

   ArgLeuGluMetTyrCysAlaProLeuLysProAlaLysSerAla
   AGATTGGAAATGTACTGTGCTCCATTGAAGCCAGCTAAGTCTGCT-3'
   TCTAACCTTTACATGACACGAGGTAACTTCGGTCGATTCAGACGA

20. A DNA construct according to claim 12, 
including the sequence: 
   AlaTyrArgProSerGluThrLeuCysGlyGlyGluLeuValAsp
5'-GCTTACAGACCATCCGAAACCTTGTGTGGTGGTGAATTGGTCGAC
   CGAATGTCTGGTAGGCTTTGGAACACACCACCACTTAACCAGCTG

   ThrLeuGlnPheValCysGlyAspArgGlyPheTyrPheSerArgProAlaSerArgVal
   ACCTTGCAATTCGTTTGTGGTGACAGAGGTTTCTACTTCTCCAGACCAGCTTCCAGAGTT
   TGGAACGTTAAGCAAACACCACTGTCTCCAAAGATGAAGAGGTCTGGTCGAAGGTCTCAA

   SerArgArgSerArgGlyIleValGluGluCysCysPheArgSerCysAspLeuAlaLeu
   TCTAGAAGATCCAGAGGTATCGTTGAAGAATGTTGTTTCAGATCTTGTGACTTGGCTTTG
   AGATCTTCTAGGTCTCCATAGCAACTTCTTACAACAAAGTCTAGAACACTGAACCGAAAC

   LeuGluThrTyrCysAlaThrProAlaLysSerGlu
   TTGGAAACCTACTGTGCTACCCCAGCTAAGTCTGAA-3'
   AACCTTTGGATGACACGATGGGGTCGATTCAGACTT
21. A DNA construct comprising a sequence
coding for human insulin-like growth factor joined to
transcriptional and translational regulatory sequences
for expression in a cellular host in culture.



